
    Game-theoretical control with continuous action sets

    Motivated by the recent applications of game-theoretical learning techniques to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets, and we propose an actor-critic reinforcement learning algorithm that provably converges to equilibrium in this class of problems. The method employed is to analyse the learning process under study through a mean-field dynamical system that evolves in an infinite-dimensional function space (the space of probability distributions over the players' continuous controls). To do so, we extend the theory of finite-dimensional two-timescale stochastic approximation to an infinite-dimensional, Banach space setting, and we prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably-convergent learning algorithm in which players do not need to keep track of the controls selected by the other agents. Comment: 19 pages
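The two-timescale scheme at the heart of such actor-critic methods can be pictured with a scalar toy problem. This is a hedged sketch only: the functions, noise levels, and step sizes below are illustrative, not the paper's algorithm. A fast iterate tracks its equilibrium for the current slow iterate (the "critic"), while the slow iterate performs noisy gradient descent on a potential using the fast iterate in place of the true state (the "actor"):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-timescale stochastic approximation (illustrative only).
# The fast iterate y tracks its equilibrium y*(x) = x for the current
# slow iterate x; the slow iterate x does noisy gradient descent on
# the potential F(x) = (x - 1)^2 / 2, using y in place of x.
# Step sizes satisfy b_n / a_n -> 0, so y sees x as quasi-static.
x, y = 5.0, 0.0
for n in range(1, 200_001):
    a_n = n ** -0.6        # fast step size
    b_n = 1.0 / n          # slow step size, b_n / a_n -> 0
    y += a_n * ((x - y) + 0.1 * rng.standard_normal())
    x += b_n * (-(y - 1.0) + 0.1 * rng.standard_normal())
# Both iterates settle near the equilibrium x* = y* = 1.
```

Because the ratio of step sizes vanishes, the fast recursion effectively equilibrates before the slow one moves, which is the separation the paper exploits in its infinite-dimensional setting.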

    A phase transition for measure-valued SIR epidemic processes

    We consider measure-valued processes $X=(X_t)$ that solve the following martingale problem: for a given initial measure $X_0$, and for all smooth, compactly supported test functions $\varphi$, \begin{eqnarray*}X_t(\varphi)=X_0(\varphi)+\frac{1}{2}\int_0^t X_s(\Delta\varphi)\,ds+\theta\int_0^t X_s(\varphi)\,ds\\{}-\int_0^t X_s(L_s\varphi)\,ds+M_t(\varphi).\end{eqnarray*} Here $L_s(x)$ is the local time density process associated with $X$, and $M_t(\varphi)$ is a martingale with quadratic variation $[M(\varphi)]_t=\int_0^t X_s(\varphi^2)\,ds$. Such processes arise as scaling limits of SIR epidemic models. We show that there exist critical values $\theta_c(d)\in(0,\infty)$ for dimensions $d=2,3$ such that if $\theta>\theta_c(d)$, then the solution survives forever with positive probability, but if $\theta<\theta_c(d)$, then the solution dies out in finite time with probability 1. For $d=1$ we prove that the solution dies out almost surely for all values of $\theta$. We also show that in dimensions $d=2,3$ the process dies out locally almost surely for any value of $\theta$; that is, for any compact set $K$, $X_t(K)=0$ eventually. Comment: Published at http://dx.doi.org/10.1214/13-AOP846 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)
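A toy discrete analogue may help fix ideas. This is a sketch only: the infection probability `p` and the Reed-Frost-style lattice model are illustrative stand-ins for the prelimiting SIR models, not the measure-valued limit itself. Recovered sites are immune forever, which is the depletion-of-susceptibles effect that the local-time term $-\int_0^t X_s(L_s\varphi)\,ds$ encodes in the limit, and the sub/supercritical contrast mirrors the phase transition in $\theta$:

```python
import random

random.seed(1)

# Toy SIR epidemic on Z^2 (Reed-Frost style, illustrative numbers):
# an infected site infects each susceptible neighbour independently
# with probability p, then recovers and is immune forever.
def run_sir(p, steps=200):
    infected = {(0, 0)}
    recovered = set()
    for _ in range(steps):
        if not infected:
            break
        nxt = set()
        for (x, y) in infected:
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb not in recovered and nb not in infected:
                    if random.random() < p:
                        nxt.add(nb)
        recovered |= infected
        infected = nxt
    return len(infected), len(recovered)

low = run_sir(0.1)   # subcritical: the epidemic dies out quickly
high = run_sir(0.9)  # supercritical: the epidemic keeps spreading
```

The immune region left behind is what makes even the supercritical epidemic vacate any fixed compact set, in line with the local-extinction result in the abstract.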

    Supersymmetric Ward Identities and NMHV Amplitudes involving Gluinos

    We show how Supersymmetric Ward identities can be used to obtain amplitudes involving gluinos or adjoint scalars from purely gluonic amplitudes. We obtain results for all one-loop six-point NMHV amplitudes in $\mathcal{N}=4$ Super Yang-Mills theory which involve two gluinos or two scalar particles. More general cases are also discussed. Comment: 32 pages, minor typos fixed; one reference added

    Guide to Researching International Human Rights Law


    Staff Perceptions of Standards-Based Grading Prior To Implementation

    The purpose of this qualitative study was to evaluate the perceptions of a group of middle school teachers regarding changing to standards-based grading (SBG). Data were collected from the transcripts of two different focus groups and analyzed. Study results indicated that SBG measures were not well known by all staff, and many clear resistance points were present. Resistance points centered around five key themes: fear of loss of rigor, community pushback, lack of SBG practices knowledge, lack of supporting infrastructure, and extra time and work required. Recommendations that flow from these results are that, prior to implementing SBG, comprehensive data be collected regarding staff beliefs about grading and reporting in general, and that targeted, differentiated professional development be planned for staff based upon the results of the data collected. Continuing to expand SBG practices within schools is the ultimate goal due to the large body of research espousing its benefits.

    Asynchronous Stochastic Approximation with Differential Inclusions

    The asymptotic pseudo-trajectory approach to stochastic approximation of Benaim, Hofbauer and Sorin is extended for asynchronous stochastic approximations with a set-valued mean field. The asynchronicity of the process is incorporated into the mean field to produce convergence results which remain similar to those of an equivalent synchronous process. In addition, this allows many of the restrictive assumptions previously associated with asynchronous stochastic approximation to be removed. The framework is extended for a coupled asynchronous stochastic approximation process with set-valued mean fields. Two-timescale arguments are used here in a similar manner to the original work in this area by Borkar. The applicability of this approach is demonstrated through learning in a Markov decision process. Comment: 41 pages
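The Markov-decision-process application can be pictured with tabular Q-learning, the canonical asynchronous stochastic approximation: only the visited (state, action) entry is updated at each step, with a step size driven by that entry's own visit count. This is a sketch under illustrative transition and reward numbers, not the paper's example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Asynchronous stochastic approximation in a toy 2-state, 2-action MDP
# (transition and reward numbers are illustrative). Only the visited
# (state, action) entry of Q is updated, and its step size depends on
# that entry's own visit count -- the hallmark of asynchronicity.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],   # P[s, a, s']
              [[0.8, 0.2], [0.3, 0.7]]])
R = np.array([[1.0, 0.0], [0.0, 2.0]])    # R[s, a]
gamma = 0.5

Q = np.zeros((2, 2))
visits = np.zeros((2, 2))
s = 0
for _ in range(100_000):
    a = int(rng.integers(2))              # explore uniformly
    s2 = int(rng.choice(2, p=P[s, a]))    # sample the next state
    visits[s, a] += 1
    step = visits[s, a] ** -0.7           # per-entry Robbins-Monro step
    Q[s, a] += step * (R[s, a] + gamma * Q[s2].max() - Q[s, a])
    s = s2
```

Because each entry has its own clock, the update frequencies of the entries differ along the trajectory; absorbing that asynchronicity into the mean field is exactly the issue the paper's framework addresses.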

    Best-response Dynamics in Zero-sum Stochastic Games

    We define and analyse three learning dynamics for two-player zero-sum discounted-payoff stochastic games. A continuous-time best-response dynamic in mixed strategies is proved to converge to the set of Nash equilibrium stationary strategies. Extending this, we introduce a fictitious-play-like process in a continuous-time embedding of a stochastic zero-sum game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, we present a modified δ-converging best-response dynamic, in which the discount rate converges to 1, and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all the dynamic processes is a separation of adaptation rates: beliefs about the value of states adapt more slowly than the strategies adapt, and in the case of the δ-converging dynamic the discount rate adapts more slowly than everything else.
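The continuous-time best-response dynamic can be discretised with an Euler scheme. As a one-state stand-in for a stochastic game, consider the zero-sum game of Matching Pennies (a hedged sketch: the game, the step size `dt`, and the horizon are illustrative choices, not the paper's construction). Each player's mixed strategy flows toward a current pure best response, and in zero-sum games this spirals into the Nash equilibrium:

```python
import numpy as np

# Euler discretisation of the continuous-time best-response dynamic
#   x' = BR(y) - x,   y' = BR(x) - y
# for the 2x2 zero-sum game of Matching Pennies (illustrative only).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])          # row player's payoff matrix

def best_response(payoffs):
    """Pure best response, returned as a vertex of the simplex."""
    e = np.zeros(len(payoffs))
    e[np.argmax(payoffs)] = 1.0
    return e

x = np.array([0.9, 0.1])             # row player's mixed strategy
y = np.array([0.2, 0.8])             # column player's mixed strategy
dt = 0.01
for _ in range(5000):
    x = x + dt * (best_response(A @ y) - x)      # row maximises x.A.y
    y = y + dt * (best_response(-A.T @ x) - y)   # column minimises it
# Both strategies spiral into the unique equilibrium (1/2, 1/2).
```

Each Euler step is a convex combination of the current strategy and a vertex, so the iterates stay in the simplex by construction; the same flow-toward-best-response structure underlies the stochastic-game dynamics in the paper.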

    Mixed-strategy learning with continuous action sets

    Motivated by the recent applications of game-theoretical learning to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets. We propose an actor-critic reinforcement learning algorithm that adapts mixed strategies over continuous action spaces. To analyse the algorithm we extend the theory of finite-dimensional two-timescale stochastic approximation to a Banach space setting, and prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably convergent learning algorithm in which players do not need to keep track of the controls selected by other agents.